Where and how does the brain code reward during social behavior? Almost all elements of
the brain's reward circuit are modulated during social behavior. The striatum in particular is 
activated by rewards in social situations. However, its role in social behavior is still poorly 
understood. Here, we attempt to review its participation in social behaviors of different 
species ranging from voles to humans. Human fMRI experiments show that the striatum 
is reliably active in relation to others' rewards, to reward inequity and also while learning 
about social agents. Social contact and rearing conditions have long-lasting effects on 
behavior, striatal anatomy and physiology in rodents and primates. The striatum also plays 
a critical role in pair-bond formation and maintenance in monogamous voles. We review 
recent findings from single neuron recordings showing that the striatum contains cells 
that link own reward to self or others' actions. These signals might be used to solve the 
agency-credit assignment problem: the question of whose action was responsible for the 
reward. Activity in the striatum has been hypothesized to integrate actions with rewards. 
The picture that emerges from this review is that the striatum is a general-purpose 
subcortical region capable of integrating social information into coding of social action 
and reward. 
INTRODUCTION

The striatum is necessary for voluntary motor control. Research 
on its role in movement planning and execution uncovered 
its participation in cognition and reward processes. Rigorous 
experimentation demanded social isolation to properly study 
this neuronal circuit. However, action, rewards and cognition 
also occur in the company of conspecifics, in a social con- 
text. Social behaviors, those behaviors that occur in a social 
context, place an extra demand on cognition since others' 
behaviors are difficult to predict and they affect our own behav- 
ior. Therefore, to understand the properties of the striatum it 
is important to study it while the organism engages in social 
behavior. Recent studies highlight this brain structure during 
different social behaviors. Among these studies, we found that 
the striatum contains neurons that signal the social action that 
will result in own reward. We place these new findings within 
the context of previous findings on the known role of this 
area in movement and reward coding in the brain. The ques- 
tion that guides the review is as follows: "does the striatum 
serve a social function?" We conclude that the striatum is a 
general-purpose subcortical region capable of integrating and 
reflecting social information into its better known non-social 
functions. 

ANATOMY AND NEUROPHYSIOLOGY OF THE STRIATUM 

The striatum is the input module to the basal ganglia, a neuronal 
circuit necessary for voluntary movement control (Hikosaka 
et al., 2000). The striatum is composed of three nuclei: caudate,
putamen, and ventral striatum. The latter contains the nucleus
accumbens (NAcc). The caudate and putamen/ventral striatum
are separated by the internal capsule, a white matter tract between
the cerebral cortex and the brainstem.

Striatal afferents arrive from three major sources: cortex, mid- 
brain and thalamus (Selemon and Goldman-Rakic, 1985; Haber, 
2003). The cortical input from temporal, parietal and frontal
cortices is mostly ipsilateral (Künzle, 1975; Van Hoesen et al., 1981) and
topographically arranged in the medio-lateral and dorsal-ventral 
axes (Selemon and Goldman-Rakic, 1985; Haber, 2003; Haber 
and Knutson, 2010). The striatum receives inputs from all ele- 
ments of the reward circuit (Figure 1, reviewed in Haber and 
Knutson, 2010): from nigro-striatal midbrain cells (Beckstead
et al., 1979), amygdala (Russchen et al., 1985; Fudge et al., 2002),
orbitofrontal cortex (OFC) (Haber et al., 2006), and anterior 
cingulate cortex (ACC) (Selemon and Goldman-Rakic, 1985; 
Calzavara et al., 2007). 

The striatum has two main efferent pathways. The direct
pathway is formed by axons of medium spiny neurons (MSN)
expressing D1 receptors, which mainly project to GABAergic neurons
in the substantia nigra pars reticulata (SNr) (Parent et al.,
1984; Gerfen et al., 1990; Kawaguchi et al., 1990; Chuhma et al.,
2011). MSN that express D2 receptors mostly target the external
segment of the globus pallidus (GPe) and form the indirect pathway
(Parent et al., 1984; Gerfen et al., 1990; Kawaguchi et al., 1990;
Chuhma et al., 2011). GABAergic neurons in GPe project to SNr
and the internal segment of the globus pallidus (GPi) (Parent and
Hazrati, 1995; Wilson, 1998). The SNr and GPi are the output
nuclei of the basal ganglia.

The principal cell type in the striatum is the MSN (Wilson, 
1998; Tepper and Bolam, 2004). These neurons release γ-aminobutyric
acid (GABA) at their synaptic terminals (Wilson, 1998).
FIGURE 1 | Depiction of the brain's reward circuit highlighting the role
of the striatum and its anatomical connections. Abbreviations: dACC,
dorsal anterior cingulate cortex; DPFC, dorsal prefrontal cortex; vmPFC,
ventromedial prefrontal cortex; VP, ventral pallidum; LHb, lateral habenula;
Hypo, hypothalamus; STN, subthalamic nucleus; SN, substantia nigra; VTA,
ventral tegmental area; PPT, pedunculopontine tegmentum. Based on
Haber and Knutson (2010), reproduced with permission.



The striatum contains many other cell types besides MSN, includ- 
ing cholinergic and fast-firing GABAergic interneurons (Tepper 
and Bolam, 2004). Cholinergic interneuron activity is related
to reward-predicting stimuli, reward and punishment
(Apicella et al., 1991b; Ravel et al., 2003). These firing
properties suggest that these neurons may play a role in learn- 
ing (Schulz and Reynolds, 2013). Fast-firing interneurons are also 
involved in reward prediction error coding (Stalnaker et al., 2012). 
However, for brevity we will limit this review to MSN and refer 
to them as striatal neurons. Functionally, striatal neurons show 
motor and reward responses (Hikosaka et al., 2000). Functional 
and anatomical evidence led to the hypothesis that striatal activ- 
ity forms a "limbic-motor" interface (Mogenson et al., 1980). 
Neurons in the striatum integrate information about expected 
reward with motor information to guide behavior (Hollerman 
et al., 1998; Hikosaka et al., 2000; Schultz, 2000; Schultz and
Dickinson, 2000; Goldstein et al., 2012). We review MSN neuro- 
physiological responses to action and reward in the next section. 

STRIATUM NEUROPHYSIOLOGY: ACTION AND REWARD 

The striatum contains neuronal activity related to move- 
ments, rewards and the conjunction of both movement and 
reward. Striatal neurons show activity related to the preparation,
initiation and execution of movements (Hollerman et al., 2000).
These neurons are also active before overt goal-directed move- 
ments (Schultz and Romo, 1988; Romo et al., 1992; Figure 2A). 
Some of these neurons are exclusively active during self- 
initiated movements, whilst other neurons are only active during 
instructed trials, and some others do not discriminate between 
self-initiated and instructed movements. In addition to this, stri- 
atal neurons also show reward related activity. Neuronal activity 
in the striatum is modulated by reward expectation indepen- 
dent of the movement necessary to obtain it (Hikosaka et al., 
1989b; Apicella et al., 1991a, 1992; Schultz et al., 1992). Striatal
neurons that discharge after reward delivery do so in two main
modes: phasic or tonic. Phasic responses usually have short latencies
(<50 ms) and are relatively short-lived, with a median duration of
500 ms (Apicella et al., 1991b; Hollerman et al., 1998; Lau and
Glimcher, 2007; Figure 2B). By contrast, tonic responses have
longer latencies and can last as long as the intertrial interval,
i.e., up to 3 s (Apicella et al., 1991b; Hollerman et al., 1998;
Histed et al., 2009). Furthermore, there are striatal neurons coding
which action is associated with reward and which action is
not (Hollerman et al., 1998; Kawagoe et al., 1998; Figure 2C).
This coding is independent of the stimuli indicating the action 
required to obtain reward (Kimchi and Laubach, 2009; Kimchi 
et al., 2009). Reward-predicting cues modulate the activity of cau- 
date neurons (Kawagoe et al., 1998; Lauwereyns et al., 2002). After 
saccade execution up to 50% of neurons encode only the action, 
while around 20% of recorded neurons encode whether the action 
was rewarded or not and close to 40% of neurons are modulated 
by both movement and reward (Kobayashi et al., 2006; Lau and 
Glimcher, 2007). Together, these data suggest that the responses of
striatal neurons are modulated by action and reward. These responses are
not limited to the moment of movement or reward receipt; rather,
they are also present during the cue and during reward expectation.

Most striatal neurons that respond during task performance 
show higher activity when a reward is expected compared to when 
no reward is expected (Hollerman et al., 1998). However, there 
are also neurons that are active preferentially after the monkey is 
instructed to not move to obtain reward (Hollerman et al., 1998). 
These data suggest that striatal neurons flexibly encode the type 
of action that will produce reward. 

An action-value neuron tracks the value of one action, inde- 
pendent of the performed action. By tracking the value of dif- 
ferent candidate actions and comparing their values an organism 
can decide to exploit the most valuable action or to explore the 
value of other actions. Samejima et al. (2005) were the first group 
to show that striatal neurons code action-value (Figure 2D). 
Neuronal activity tracked over time the value of performing one 
action regardless of the animal's choice. Later, Lau and Glimcher 
(2008) trained macaques to perform a matching task. In this 
task rewards are distributed probabilistically between two options 
and subjects match the frequency with which they choose one 
action with its reward probability (Herrnstein, 1961). This task 
opens the possibility of investigating the presence of action- 
value and chosen-value (i.e., value of the chosen action) neurons. 
Indeed, Lau and Glimcher found that caudate neurons code both action-value
and chosen-value. These signals can inform decision making 
mechanisms. 
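
The distinction between action-value and chosen-value can be illustrated with a minimal simulation. The sketch below is not the analysis used in the cited studies; it assumes a simple delta-rule update, an epsilon-greedy choice rule, and hypothetical reward probabilities, learning rate and exploration rate chosen only for illustration.

```python
import random


def update_value(value, reward, alpha):
    """Move a value estimate toward the obtained reward by a fraction alpha."""
    return value + alpha * (reward - value)


def simulate_choices(reward_prob, n_trials=1000, alpha=0.1, epsilon=0.1, seed=0):
    """Track action values in a two-option probabilistic reward task.

    reward_prob maps each action to a hypothetical reward probability.
    Returns the final action values and the chosen value on every trial.
    """
    rng = random.Random(seed)
    actions = sorted(reward_prob)
    action_value = {a: 0.0 for a in actions}
    chosen_values = []
    for _ in range(n_trials):
        # Explore occasionally; otherwise exploit the most valuable action.
        if rng.random() < epsilon:
            choice = rng.choice(actions)
        else:
            choice = max(actions, key=lambda a: action_value[a])
        reward = 1.0 if rng.random() < reward_prob[choice] else 0.0
        # Chosen value: the value of the action actually taken on this trial.
        chosen_values.append(action_value[choice])
        # Only the chosen action's value is updated; the value of the other
        # action is kept regardless of the current choice.
        action_value[choice] = update_value(action_value[choice], reward, alpha)
    return action_value, chosen_values


values, chosen = simulate_choices({"left": 0.9, "right": 0.1})
print(values)
```

In this sketch an action-value signal corresponds to one entry of `action_value` (for example, the value of "left" on every trial, whether or not "left" is chosen), whereas a chosen-value signal corresponds to the entries of `chosen_values`.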
FIGURE 2 | Action and reward coding by striatal neurons. (A)
Example striatal neuron active before movement (go) and silent before
no-movement (no-go). Based on Schultz and Romo (1988), reproduced
with permission. (B) Example striatal neurons coding reward. First row
depicts a neuron with phasic activity after juice reward delivery
independent of the action to obtain reward. Second row depicts a
neuron with tonic activity after juice reward delivery. Third row shows a
neuron with tonic activity after no reward is delivered. Based on
Hollerman et al. (1998), reproduced with permission. (C) Example
caudate neuron coding the conjunction of action and reward. This
neuron is active during the presentation of a cue indicating the saccade
necessary to complete the trial if the trial will be rewarded (rewarded
direction is highlighted by a bull's eye). R, right; U, up; L, left; D, down.
Polar plots show the average response for each cue and direction.
Based on Kawagoe et al. (1998), reproduced with permission. (D) (Top)
Depiction of the probability of larger rewards associated with left or
right actions on each condition block. Colored numbers refer to the
probability associated with left-right actions. (Bottom) Example striatal
neuron coding right action value. Based on Samejima et al. (2005),
reproduced with permission.



In conclusion, the striatum contains neuronal activity related 
to movements, rewards and the conjunction of both movement 
and reward. These neuronal representations serve many functions,
such as goal-directed movements and decision making.

STRIATAL ACTIVITY DURING SOCIAL BEHAVIOR 
SOCIAL REWARD 

Rewards are events or objects that elicit learning, elicit approach 
behavior and produce positive emotions (Schultz, 2004). Social 
rewards are just like any other rewards, with the particularity
that they occur in a social context. We propose a simple
classification of social rewards using two axes: who acts and who
receives reward. For example, observing others is a social reward
(Anderson, 1998; Deaner et al., 2005) where the individual acts
(observes) and receives reward (the social stimulus). Pro-social
behavior refers to a preference to increase the welfare of oth- 
ers (Fehr and Camerer, 2007). Depending on individual social 
preferences these choices can be rewarding by themselves, e.g., in 
charitable giving (Harbaugh et al., 2007). Vicarious reward refers 
to the situation when observing someone else receive reward is 
rewarding in itself (Mobbs et al., 2009). Finally, in several social 
rewards the recipient is the individual and the actor is someone 
else. Examples of others' actions that are rewarding include praise
and pleasant touch (Francis et al., 1999; Olausson et al., 2002;
Rolls et al., 2008; Korn et al., 2012). Building a desired reputation
is also considered a social reward; critically, reputation
depends on others' perception of the individual, not on the individual's
perception of herself (Izuma et al., 2008; Izuma, 2012).
Receiving gifts or social actions that result in own reward can also 
be considered as other-generated social rewards. Social inclusion 
can be considered a social reward and facilitates learning (Eger 
et al., 2013). Although this classification might further our understanding
of the neuronal underpinnings of social rewards, further
experimentation is needed to validate its use.

Observing others 

Fuelling a brain entails a huge cost, and the ratio of brain size
to body size is larger in primates than in any other order in the animal
kingdom (Laughlin and Sejnowski, 2003; Dunbar and Shultz,
2007). This cost raises the question: what is the benefit of such
large brains? Byrne and Whiten
suggest that only a costly primate brain can deal with the complexity
of primate social living, the so-called social brain hypothesis
(Dunbar and Shultz, 2007). The primate brain has many
specializations to acquire information about conspecifics.
Neurons in the ventral visual pathway respond selectively to biological
motion, gaze direction, body parts and faces (Perrett et al.,
1984, 1985a,b; Gross, 1992; Oram and Perrett, 1996; Tsao et al.,
2006). Social information arrives through all senses. For example,
the superior temporal polysensory area contains neurons that
selectively respond to conspecific calls (Perrodin et al., 2011) and
local field potentials in the temporal lobe are modulated by face
or call familiarity (Baez-Mendoza and Hoffman, 2009). The volume
of gray matter correlates with the size of the individual's
troop in mid superior temporal sulcus, inferotemporal cortex,
rostral superior temporal sulcus and amygdala (all areas involved
in perceiving individuals) and in rostral PFC in macaques (Sallet
et al., 2011). These findings suggest that the brain has specialized
structures dealing with the acquisition and representation of
information about conspecifics.

If the brain has specialized structures for the acquisition and 
representation of information about conspecifics, then acquir- 
ing this information must be valuable for the individual. In a 
clever paradigm Deaner and colleagues measured the value of 
acquiring access to observe pictures of conspecifics (Deaner et al., 
2005). They pitted a constant amount of juice against a variable 
amount of juice plus the opportunity to observe the picture of a 
conspecific. The monkeys made their choices depending on the 
amount of juice offered along with the picture. If the monkey
chose a smaller amount of juice plus the opportunity to watch
an image, it strongly indicated that the monkey valued watching
the image at least as much as the difference between the offered
juice volumes. For example, a monkey that likes watching a high-ranking
monkey will choose watching the image and receiving 0.8 ml of
juice over receiving only 1 ml of juice. When the monkey chooses with
equal probability between the two alternatives, the difference
in offered juice volume is the subjective value of observing the
image, the so-called point of subjective equivalence. Researchers
using this method can measure the subjective value of varying
juice magnitudes (fluid value) and that of social images (image
value). Another advantage of this method is that it facilitates the
comparison of different goods (Glimcher, 2010), e.g., observing
female perinea or a subordinate male face. Using this method
Deaner and colleagues reported that male monkeys valued highly
looking at dominant monkeys and the perinea of female monkeys
compared to looking at subordinate monkeys or a non-salient
visual stimulus (Deaner et al., 2005).
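
The point of subjective equivalence described above can be estimated from choice frequencies. The following sketch is only an illustration: it assumes linear interpolation between tested juice differences (the original study may have fitted a psychometric function instead), and the choice data are hypothetical.

```python
def point_of_subjective_equivalence(juice_diff_ml, p_choose_image):
    """Estimate the juice difference at which both options are chosen equally.

    juice_diff_ml: juice offered with the image minus juice offered alone (ml),
        sorted in ascending order.
    p_choose_image: fraction of trials on which the image option was chosen.
    Returns the interpolated difference at p = 0.5, or None if choices never
    cross 0.5 within the tested range.
    """
    pairs = list(zip(juice_diff_ml, p_choose_image))
    for (x0, p0), (x1, p1) in zip(pairs, pairs[1:]):
        crosses = (p0 - 0.5) * (p1 - 0.5) <= 0
        if crosses and p0 != p1:
            # Linear interpolation between the two neighbouring data points.
            return x0 + (0.5 - p0) * (x1 - x0) / (p1 - p0)
    return None


# Hypothetical data: the image option is still chosen half of the time when it
# carries about 0.2 ml less juice than the juice-only option.
juice_diff_ml = [-0.4, -0.3, -0.1, 0.0]
p_choose_image = [0.1, 0.3, 0.7, 0.9]

pse = point_of_subjective_equivalence(juice_diff_ml, p_choose_image)
image_value_ml = -pse  # juice the animal forgoes in order to see the image
print(pse, image_value_ml)
```

With these illustrative numbers the image value is about 0.2 ml of juice, which matches the 0.8 ml vs. 1 ml example in the text.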

Neuronal activity during this task has been measured in dif- 
ferent brain regions. LIP neuronal activity correlates with both 
image value and fluid value when the monkeys chose to look at the 
image (Klein et al., 2008). OFC neurons showed distinct coding
of reward magnitude or image value, but not both (Watson and
Platt, 2012). Thus, these results suggest that OFC neurons do not
code reward in a single currency (e.g., in juice volume), but rather as
different variables, as shown before (O'Neill and Schultz, 2010).
Intriguingly, these animals strongly preferred looking at pictures
of subordinates, a finding at odds with previously reported strong
preferences for dominant faces in the same paradigm (Deaner and
Platt, 2003; Deaner et al., 2005; Shepherd et al., 2006; Klein et al.,
2008). Nevertheless, this result suggests that the encoding of social reward
reflects subjective preferences.

Neurons in the anterior striatum showed an interesting
response pattern in the same paradigm (Klein and Platt, 2013).
The large majority of reward-responsive neurons were selective
for reward type. These neurons also showed a regional pattern:
those in the caudate were more strongly modulated by social
reward, whereas putamen neurons were more strongly modulated
by liquid reward. This pattern can be alternatively explained
by simple saccade direction coding because caudate neurons are
tuned for saccade direction, particularly for contralateral saccades
(Hikosaka et al., 1989a).

Humans also value observing other humans; and among dif- 
ferent targets we value highly observing our romantic partners 
and mothers (Bartels and Zeki, 2000, 2004; Aron, 2005; Acevedo 
et al., 2012). Observing pictures of a partner elicits higher blood
oxygen level-dependent (BOLD) activity in caudate/putamen
and VTA, along with cingulate and insular cortex, compared to
viewing pictures of friends matched to the partner for age, gender and
length of friendship (Figure 3, green squares). This effect
is present both when the relationship is recent (Aron, 2005)
and when it has been long established (Acevedo et al., 2012). These
BOLD responses are a neural correlate of the value of observing a 
loved one. 

In summary, acquiring social information, in particular look- 
ing at conspecifics, is valuable for the individual (Deaner et al., 
2005). The primate temporal lobe contains regions whose func- 
tion includes the processing of social information (Tsao et al., 
2006; Perrodin et al., 2011). Both social information and value 
converge in the striatum, opening the possibility of social reward
coding in this brain region, as shown by Klein and Platt (2013).

Other social rewards 

A positive reputation is a social reward as it can elicit learning,
approach behavior and positive emotions. This is particularly evident
in indirect reciprocity: a donor who helps a recipient in
public might receive in the future a donation from someone who
has observed this "altruistic" behavior (Nowak, 2006). Obtaining
a good reputation from others increases BOLD activity in the
human striatum (Izuma et al., 2008; Korn et al., 2012) (Figure 3,
red squares), but not in individuals diagnosed with autism (Izuma
et al., 2011). This difference is likely due to insensitivity to social
rewards in individuals with autism (Dawson et al., 1998; Schultz, 2005).

FIGURE 3 | fMRI studies of social behaviors in which the striatum is
active. Peak activation coordinates in the striatum of the fMRI studies
cited in this review, color-coded for each section as illustrated in the
legend. Studies using a region of interest analysis strategy were not
included in this image. These striatal responses are compatible with a
general activation in response to social behaviors, including social rewards.
Functional subdivisions according to types of social rewards
await further experiments. Studies aggregated in "Other social rewards":

Other social rewards that also increase BOLD activity in 
the striatum include charitable donations (Moll et al., 2006;
Harbaugh et al., 2007) and observing someone else succeed
(Mobbs et al., 2009). Vicarious reward is also modulated by the 
closeness of the recipient: there is higher striatal BOLD activ- 
ity when sharing a monetary gain with close friends compared 
to sharing with strangers, and sharing with the latter is associ- 
ated with higher activations compared to when the "recipient" is 
a computer (Fareri et al., 2012). This social vs. non-social effect 
has also been observed when cooperating with a human partner 
vs. cooperating with a computer (Rilling et al., 2002). The peak 
activations from studies cited in this section are illustrated with 
red squares in Figure 3. Taken together, these data suggest that 
social rewards are associated with BOLD activity in the striatum 
and can be modulated by the social context. 

LEARNING ABOUT SOCIAL AGENTS 

Social life is rife with opportunities to learn about others. For 
example, we learn to trust or mistrust other people. The trust 
game is an economic game that measures how trust is built 
between two individuals. During the trust game the investor
receives an initial endowment that she can choose to invest in
a trustee; the trustee receives three times the investment and
decides how much of the gains to return to the investor. When
this game is played iteratively the investor learns to trust (or 
mistrust) the trustee and vice versa. Thus, both players develop 
a model of the other's reputation (King-Casas et al., 2005). To 
build a trust model investors use previous behavior to predict 
future behavior. If there is a deviation from what is predicted
(a reward prediction error), then the model is updated. Activity in
dorsal striatum mirrored prediction errors during the repayment
phase (Figure 3, yellow squares; King-Casas et al., 2005). When
an investor returned more than what a trustee expected, the
trustee reciprocated by increasing her investment. During the
investment phase activity increased in middle cingulate cortex
of the investor and also in ACC of the trustee. Activity
in both areas correlated with activity in the trustee's caudate;
most importantly, the peak of these correlations shifted from the
repayment epoch to the investment epoch (King-Casas et al.,
2005). These results suggest that generating a model of someone else's
reputation engages a reinforcement learning algorithm that uses
prediction errors, and that these are reflected in striatal BOLD
activity.

FIGURE 3 (caption, continued) | (Rilling et al., 2002; Moll et al., 2006; Izuma et al., 2008; Mobbs et al.,

Prior information about someone's trustworthiness sets the
initial state of the trust model. This initial bias can be overruled
by observing someone's willingness to reciprocate trust (Figure 3,
yellow squares; Delgado et al., 2005; Phan et al., 2010; Fouragnan
et al., 2013). Prior information diminishes the magnitude of the
reward prediction error signal in the striatum during the repayment
phase (Fouragnan et al., 2013). Following advice to solve a
task (a type of prior information) generates an outcome-bonus in
a version of the Iowa gambling task (Biele et al., 2011). These studies
suggest that prior information not only sets the initial state of
the trust model, but also has a long-lasting effect on its computation.

Depth-of-thought refers to a person's inference about some- 
one else's intention and to how many iterations of this inference 
they perform (Dixit and Skeath, 2004). Players in the trust game 
solve the game with different levels of depth-of-thought (Xiang 
et al., 2012). If the investor makes no inference about the trustee's 
intention to reciprocate, then a prediction error occurs when 
the trustee does not reciprocate trust. This prediction error is 
reflected in increased striatal activity (Figure 3, yellow squares; 
Xiang et al., 2012). If the investor infers that he plays this game
against a trustee who infers what he will offer, then the prediction
error occurs when the investor submits his investment to the
trustee; again, the striatum reflects this prediction error (Xiang
et al., 2012). Thus, the computation of prediction errors during
the trust game depends on depth-of-thought.
Oxytocin, a neuropeptide, also modifies how we update
the trust model. Intranasal administration of this neuropep- 
tide increases the rate of trust decisions compared to placebo, 
even after repeated violations of trust (Kosfeld et al., 2005). 
Correspondingly, people that received oxytocin showed a smaller 
negative prediction error signal in the striatum after repeated 
violations of trust (Baumgartner et al., 2008). Although the dis- 
tribution of oxytocin receptors in the human brain is unknown, 
one possible locus where oxytocin modifies trust is in the stria- 
tum (see section "Involvement of the Striatum in Pair-Bond 
Formation and Maintenance" below). 

Social life is also rife with opportunities to learn from oth- 
ers. Observational learning is another social cognitive process 
that can be modeled with reinforcement learning. Burke and col- 
leagues hypothesized that observational learning is composed of 
two prediction errors, an action observation prediction error and 
an outcome observation prediction error (Burke et al., 2010). In 
their task two individuals took turns to learn which one of two 
decks of cards provided a better outcome. In order to disentan- 
gle individual learning from imitation learning and observational 
learning the individuals performed the task in three conditions:
the other's actions and outcomes were private, only the other's action
was visible, or both the partner's action and outcome were
observable. Burke and colleagues found a correlate for action
observation prediction error in dorsolateral prefrontal cortex 
(DLPFC) and for outcome observation in ventromedial pre- 
frontal cortex (VMPFC) and ventral striatum (Figure 3, yellow 
squares). Specifically, VMPFC activity correlated positively and 
ventral striatum correlated negatively with the outcome observa- 
tion prediction error (Burke et al., 2010). Thus, they found neural 
correlates of observational learning in frontal cortex and ventral 
striatum. 
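
The two prediction errors proposed for observational learning can be written down compactly. The sketch below is an assumption-laden illustration rather than the model fitted by Burke and colleagues: the exact form of each error term, the learning rates and the option names are hypothetical.

```python
def observe_partner(action_prob, value, observed_action, observed_outcome,
                    lr_action=0.2, lr_outcome=0.2):
    """Update an observer's beliefs after watching a partner act.

    action_prob: observer's predicted probability of each partner action.
    value: observer's estimated outcome value of each action.
    Returns both prediction errors and the updated beliefs.
    """
    # Action observation prediction error: how unexpected the partner's
    # choice was (assumed form: 1 minus its predicted probability).
    action_pe = 1.0 - action_prob[observed_action]
    # Outcome observation prediction error: observed outcome minus the
    # observer's expected outcome for that action.
    outcome_pe = observed_outcome - value[observed_action]

    new_prob = {
        a: p + lr_action * ((1.0 if a == observed_action else 0.0) - p)
        for a, p in action_prob.items()
    }
    new_value = dict(value)
    new_value[observed_action] += lr_outcome * outcome_pe
    return action_pe, outcome_pe, new_prob, new_value


action_pe, outcome_pe, probs, values = observe_partner(
    action_prob={"deck_A": 0.5, "deck_B": 0.5},
    value={"deck_A": 0.0, "deck_B": 0.0},
    observed_action="deck_A",
    observed_outcome=1.0,
)
print(action_pe, outcome_pe, probs, values)
```

When only the partner's action is visible, only the first error term is available to the observer; when action and outcome are both visible, both terms can drive learning.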

In conclusion, the neuronal mechanism of learning to trust
someone else or to learn from someone else is based on a reinforcement
learning algorithm. This algorithm makes predictions about
others' behavior, and prediction errors help to update the model.
The type of predictions depends on depth-of-thought, and prior
information modifies the rate at which the model is updated.
These learning signals are reflected in changes in BOLD activity 
in the striatum. 
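
The learning scheme summarized above can be sketched as a delta-rule model of the trust game. This is a minimal illustration, not the model used in the cited studies: it assumes the investor tracks the fraction of the tripled investment that the trustee returns, and it treats prior information as a different starting expectation combined with a lower learning rate. All numerical values are hypothetical.

```python
def trust_game_payoffs(endowment, invested, fraction_returned, multiplier=3):
    """Payoffs (investor, trustee) for one round of the trust game."""
    received = multiplier * invested          # trustee receives 3x the investment
    returned = fraction_returned * received   # amount sent back to the investor
    return endowment - invested + returned, received - returned


def update_trust(expected_return, observed_return, learning_rate):
    """One delta-rule update of the expected repayment fraction."""
    prediction_error = observed_return - expected_return
    new_expectation = expected_return + learning_rate * prediction_error
    return new_expectation, prediction_error


def learn_about_trustee(prior_expectation, learning_rate, observed_returns):
    """Run the update over a sequence of observed repayment fractions."""
    expectation = prior_expectation
    errors = []
    for observed in observed_returns:
        expectation, error = update_trust(expectation, observed, learning_rate)
        errors.append(error)
    return expectation, errors


# Hypothetical trustee who reciprocates twice and then stops repaying.
returns = [0.5, 0.5, 0.0, 0.0]
no_prior = learn_about_trustee(0.33, 0.3, returns)
good_reputation = learn_about_trustee(0.6, 0.1, returns)
print(no_prior, good_reputation)
```

In this sketch a lower learning rate makes the expectation set by prior information persist for longer after violations of trust, which is one simple way to express a long-lasting effect of prior information on the trust model.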

INEQUITY AND FAIRNESS CONSIDERATIONS 

Inequity arises from an asymmetric distribution of resources 
between two or more conspecifics. Classic economics assumes 
that agents always intend to maximize their own benefit regardless
of others' wellbeing (Von Neumann and Morgenstern, 1947).
However, the difference in resource distribution can have a neg- 
ative impact on the utility and subjective value of an object 
(Loewenstein et al., 1989; Fehr and Schmidt, 1999). The disu- 
tility from an unequal outcome depends on who obtains more 
resources. When the agent receives more than the conspecific, 
we speak of advantageous inequity. Conversely, when the agent 
receives less than the conspecific we speak of disadvantageous 
inequity. 

Interestingly, humans choose to lower their own payoff so that
inequity is smaller, a so-called pro-social behavior. For example,
when people donate money to charity they diminish their
wealth so that others can be better off (Harbaugh et al., 2007).
Disadvantageous inequity, having less than others, can have a
negative effect on behavior. For example, progressive taxation is
designed to reduce income inequality by implementing higher
taxes on higher earners (Wilkinson and Pickett, 2010). An influential
hypothesis of how people react to inequity (Fehr and
Schmidt, 1999) posits that unequal payoffs are aversive; therefore,
agents try to minimize them. This theory has its roots in
the idea that one can estimate social utility functions that specify
the level of satisfaction as a function of outcome to self and other
(Loewenstein et al., 1989). Other example theories where social
utility functions help to explain human preferences that devi- 
ate from pure maximization include "Equity, Reciprocity, and 
Competition" by Bolton and Ockenfels (Bolton and Ockenfels, 
2000) and "Fairness" by Rabin (Rabin, 1993). 
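
A social utility function of the kind discussed above can be made concrete with the two-player form of the inequity aversion model of Fehr and Schmidt (1999). The sketch below is illustrative: the parameter values are hypothetical, and `alpha` and `beta` denote the weights on disadvantageous and advantageous inequity, respectively.

```python
def fehr_schmidt_utility(own, other, alpha, beta):
    """Two-player inequity-averse utility.

    own, other: payoffs to self and to the other player.
    alpha: weight on disadvantageous inequity (other has more than self).
    beta: weight on advantageous inequity (self has more than other).
    """
    disadvantageous = max(other - own, 0.0)
    advantageous = max(own - other, 0.0)
    return own - alpha * disadvantageous - beta * advantageous


# Hypothetical parameters: an unequal split is worth less than its face value.
print(fehr_schmidt_utility(8, 2, alpha=1.0, beta=0.5))  # better off than other
print(fehr_schmidt_utility(2, 8, alpha=1.0, beta=0.5))  # worse off than other
print(fehr_schmidt_utility(5, 5, alpha=1.0, beta=0.5))  # equal split
```

Setting both weights to zero recovers the purely self-interested agent assumed by classic economics, for whom utility equals own payoff.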

One experimental task commonly used to measure advan- 
tageous inequity aversion is the dictator game (Forsythe et al., 
1994). In this task the person playing as dictator receives an ini- 
tial financial endowment and decides to give an amount of the 
endowment to a receiver. The neoclassical assumption of ratio- 
nal behavior predicts that dictators will not give away anything 
of their payoff; however, dictators usually give away between 5 
and 25% of their initial endowment (Forsythe et al., 1994). It is 
assumed that the proportion of money given to the receiver is a 
measure of the disutility for the dictator of having more than the 
other (Gibbons, 1992; Camerer et al., 2004). To measure disadvantageous
inequity aversion scientists use the ultimatum game
(Güth et al., 1982). In this game the proposer receives an endowment
and proposes a split to the responder, just as in the dictator
game. The responder then either rejects the split, in which case
neither player receives anything, or accepts it. Neoclassical economic
models predict that the responder will accept any split that results in
him having more than nothing. However, responders tend to only accept
splits where they obtain more than 30% of the initial endowment
(Güth et al., 1982). The responder's minimum acceptable offer
is the percentage of the initial endowment that he is willing to
accept 50% of the time (Camerer et al., 2004). This last parameter
increases with the degree of disadvantageous inequity
aversion.
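
The payoff structure of the two games can be summarized in a few lines. This sketch makes one simplifying assumption that is not in the text: the responder is modeled with a hard acceptance threshold, whereas the minimum acceptable offer is defined empirically as the offer accepted 50% of the time. The function names and amounts are hypothetical.

```python
def dictator_payoffs(endowment, fraction_given):
    """Payoffs (dictator, receiver); the receiver has no say in the outcome."""
    given = fraction_given * endowment
    return endowment - given, given


def ultimatum_payoffs(endowment, fraction_offered, min_acceptable_fraction):
    """Payoffs (proposer, responder) with a threshold responder.

    If the offer falls below the responder's minimum acceptable fraction,
    the split is rejected and neither player receives anything.
    """
    if fraction_offered >= min_acceptable_fraction:
        offered = fraction_offered * endowment
        return endowment - offered, offered
    return 0.0, 0.0


print(dictator_payoffs(10, 0.2))         # dictator keeps 8, receiver gets 2
print(ultimatum_payoffs(10, 0.2, 0.3))   # offer below threshold: rejected
print(ultimatum_payoffs(10, 0.5, 0.3))   # even split: accepted
print(ultimatum_payoffs(10, 0.1, 0.0))   # threshold of zero: low offer accepted
```

A threshold of zero approximates the neoclassical responder, who accepts any offer; a threshold near 0.3 approximates the behavior reported for human responders.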

When subjects play the dictator game as dictators the ventral
striatum is active when deciding to donate money to a charity
(Moll et al., 2006; Harbaugh et al., 2007) and when enacting
the decision on how to distribute a good between two charitable
possibilities (Hsu et al., 2008). The relative wealth of the
donor and the receiver also matters to how the brain responds
to these decisions. After one of two volunteers was made better
off than the other, the worse-off volunteers rated
receiving money as much more appealing than did their better-off
counterparts (Tricomi et al., 2010). Accordingly, in worse-off
volunteers ventral striatum and VMPFC showed higher activity during
transfers to self than to the other. Better-off volunteers found it more
appealing that the other received money than that they themselves did.
Ventral striatum and VMPFC
reflected this preference: both brain regions showed higher activity
during transfers to other than to self (Tricomi et al., 2010).
In a related experiment, Fliessbach and colleagues paid pairs of
volunteers in different ratios for correctly completing a simple
task while they were in an MRI scanner (Fliessbach et al., 2007).
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Ventral striatum activity was positively correlated with the ratio 
of the payoff regardless of the actual personal monetary payoff. 
Furthermore, striatal activity was lowest during own errors and 
highest during other's errors. Such a social contrast has been con- 
firmed, e.g. activity in ventral striatum is higher after winning a 
lottery in public vs. winning the same amount in private (Bault 
et al., 2011). The peak activations from the fMRI studies cited 
in this section are illustrated in Figure 3 with pink squares. Thus, 
these data suggest that the striatum reflects the difference between 
own and other's rewards. 

AGENCY CODING IN STRIATAL NEURONS 

Reciprocal social interactions provide the opportunity to increase 
fitness through repeated exchanges with a particular individ- 
ual, although one of its by-products is reward inequality. For 
this interaction to be successful several mental processes need 
to take place (Axelrod and Hamilton, 1981): both participants 
need to identify their partner, assign agency for the current out- 
come, decide how to act depending on the series of events and 
keep a tally of the recent exchanges. Without partner identifi- 
cation reciprocity is virtually impossible (unless all interactions 
take place with a uniform population) (Dawkins, 2006). Without
a memory trace of the outcomes of recent exchanges, participants
might find themselves locked into a "one-way street"
reciprocal exchange. Agency assignment allows the individual to
assign credit (or blame) for a shared outcome (Wolpert et al., 
2003; Tomlin et al., 2006). With precise agency assignment in 
the memory of recent exchanges individuals can avoid free rid- 
ers (Dawkins, 2006). Therefore, agency assignment is a trait that 
might have been favored by evolution in social animals. 

Another way to frame the problem of agency assignment is 
to think of it as the "social" extension of the credit-assignment 
problem (Figure 4A). Let us review what the credit-assignment
problem is. For an action to be reinforced, it must be singled out
from the various actions made between the operant and
the reinforcer. The organism needs to assign credit to the operant,
and not assign (or subtract) credit to other non-contingent
actions (Sutton and Barto, 1998). This is done by changing the
weights of different eligibility traces, or memories of past actions
(Sutton and Barto, 1998). The agency credit assignment problem
applies when more than one actor can generate a reward (Tomlin
et al., 2006). Thus, the agency credit assignment problem can be
cast by paraphrasing Sutton and Barto (1998): how do you distribute
credit for success among the many actors that may have
been involved in producing it?
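
One way to picture this is a minimal sketch in which every
(actor, action) pair leaves a decaying eligibility trace and the
reward, once delivered, is credited in proportion to those traces.
The function name, the decay and learning-rate values and the example
actions are hypothetical; this is an illustration of the idea, not a
model proposed or fitted in the studies cited above.

```python
from collections import defaultdict


def assign_agency_credit(events, reward, decay=0.5, learning_rate=0.1):
    """Distribute credit for a reward over past (actor, action) pairs.

    events: list of (actor, action) tuples, oldest first; the reward is
            assumed to arrive right after the last event.
    Returns a dict mapping each (actor, action) pair to its credit.
    """
    traces = defaultdict(float)
    for actor, action in events:
        # All existing memories fade a little with every new event.
        for key in list(traces):
            traces[key] *= decay
        # The event that just happened is fully eligible for credit.
        traces[(actor, action)] += 1.0
    return {key: learning_rate * reward * trace
            for key, trace in traces.items()}


def credit_per_actor(credit):
    """Sum credit over actions to answer the question of who did it."""
    totals = defaultdict(float)
    for (actor, _action), value in credit.items():
        totals[actor] += value
    return dict(totals)
```

For instance, calling `assign_agency_credit([("self", "scratch"),
("conspecific", "touch_target")], reward=1.0)` gives the conspecific's
recent action more credit than the animal's own earlier, non-contingent
action, which is the distinction an agency signal would need to carry.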

The striatum is well-suited for integrating social action (an 
action made in a social context) and reward given its anatom- 
ical connections and known role in action and reward coding. 
We recorded striatal neurons' activity while an animal performed
a reward giving task with a conspecific in order to investigate
the interaction of social action and reward (Baez-Mendoza et al.,
2013). The reward giving task is an extension of the paradigm
described by Hollerman et al. (1998) to encompass several social
dimensions. In the original paradigm the activity of striatal neurons
was tested for relationships to movement vs. no-movement
and reward vs. no-reward. In our task we tested whether striatal
neuron activity was related to own vs. conspecific's movement and
own and/or conspecific's reward. During the experiment two
monkeys sat opposite each other across a table with a touchscreen.
The animals took turns to complete the following task:
the actor held a resting key with its right arm; the computer
presented two simultaneous cues predicting reward (circle) or
no reward (square) separately for each animal (Figure 4B), followed
by a blue go signal that prompted the actor to reach out and
touch it (Figure 4B). After a brief delay, the computer delivered
reward to the actor and then to the conspecific. By varying
reward presence and absence for both players, and who performed
the task, we were able to probe the neuronal correlates of agency
and reward coding. This simple task allowed us to probe the
neuronal mechanisms of a complex cognitive process.
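
The structure of the task can be sketched as a crossing of three
binary factors. The sketch below assumes a fully crossed design (every
combination of actor, own reward and conspecific's reward occurs); the
labels and the function name are hypothetical and "own" refers to the
recorded animal.

```python
from itertools import product

ACTORS = ("self", "conspecific")  # who performs the reach on this trial
REWARD = (True, False)            # circle cue (reward) vs. square cue (none)


def trial_types():
    """Enumerate trial types: actor x own reward x conspecific's reward."""
    return [
        {"actor": actor, "own_reward": own, "conspecific_reward": other}
        for actor, own, other in product(ACTORS, REWARD, REWARD)
    ]
```

Under this assumption there are eight trial types, so the effect of
own reward can be compared between own turns and conspecific's turns
while the conspecific's reward is held constant.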

Our first concern was whether the monkeys were sensitive to 
the social nature of the task. Reaction times and eye fixation anal- 
ysis suggested that the monkeys were sensitive to reward received 
by themselves and their conspecific. Importantly, the animals 
were less likely to move whenever it was the conspecific's turn, 
suggesting that they had an understanding of the turn-taking 
structure of the task. This is particularly relevant for agency credit 
assignment because during "own turns" the animal should have 
assigned credit to itself for own reward and during "conspecific's 
turns" to the conspecific. 

Own reward modulated the activity of striatal neurons, as previously
observed (Hikosaka et al., 1989b; Apicella et al., 1991a),
but few striatal neurons responded to the conspecific's reward.
Interestingly, a sub-population of neurons differentiated between
social actors, with some neurons firing more strongly during
one actor's turns than during the other's. Given these types of
neuronal modulations, we then looked at the neurons' sensitivity to
whose turn it was. A large number of own reward coding neurons
reflected the social actor: some neurons responded to own reward only
when the recorded animal acted (Figure 4C), whereas a different
sub-population responded to own reward when the conspecific
acted (Figure 4D). We tested a series of alternative hypotheses for
these data, including eye position, response inhibition, temporal
discounting and reward cost, none of which provided a satisfactory
explanation of the data.
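
The distinction between these two sub-populations can be illustrated
with a minimal sketch that computes the own-reward effect separately
for each actor. The fixed threshold on the difference in mean firing
rate stands in for a proper statistical test, and the function names,
labels and threshold value are hypothetical rather than the analysis
used in the original study.

```python
from statistics import mean


def reward_effect_by_actor(rates):
    """Own-reward effect (rewarded minus unrewarded mean rate) per actor.

    rates maps (actor, own_reward) to a list of firing rates in spikes/s,
    with actor in {"self", "conspecific"} and own_reward a bool.
    """
    return {
        actor: mean(rates[(actor, True)]) - mean(rates[(actor, False)])
        for actor in ("self", "conspecific")
    }


def classify_neuron(rates, threshold=2.0):
    """Label a neuron by the actor for which it codes own reward."""
    effects = reward_effect_by_actor(rates)
    on_own_turns = abs(effects["self"]) >= threshold
    on_other_turns = abs(effects["conspecific"]) >= threshold
    if on_own_turns and on_other_turns:
        return "own reward, either actor"
    if on_own_turns:
        return "own reward, own action"
    if on_other_turns:
        return "own reward, conspecific's action"
    return "no own-reward coding"
```

A neuron like the one in Figure 4C would fall in the "own reward, own
action" class of this sketch, and a neuron like the one in Figure 4D
in the "own reward, conspecific's action" class.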

We also found a collection of neurons that reflected whose 
trial it was. These neurons fired more strongly during own trials 
than conspecific's trials, or vice versa: conspecific > own tri- 
als. These neurons reflected social action as they differentiated 
between actors. To test whether these neurons truly reflected a 
"social" component of the task we measured their activity while 
the animal performed the task with the conspecific or a non- 
social juice recipient (an empty bucket). If a neuron is modulated 
by the social component of the task, then it should stop differ- 
entiating between actors during the "bucket test." This test for 
social-specific coding indicated that close to 50% of the neurons
coding the social actor were indeed modulated by the social
environment. This is, to our knowledge, the first direct test of a neuronal
correlate of social behavior in single neurons. 

These experiments showed that there are multiple signals in 
the striatum relevant for social interactions. The data suggest an
extension of the known role of the striatum in movement and 
reward processing into the social domain. Several questions arise 
from these findings. 

FIGURE 4 | Agency credit assignment cartoon and striatal neurons
coding social action and own reward. (A) Once the monkey receives a
banana it needs to know which action produced reward to assign credit.
The action can be its own (solid lines) or someone else's (dashed lines).
Many actions take place before reward is delivered, therefore looking at a
memory of each action or eligibility trace (brown arrows) can solve the
agency credit assignment problem. (B) Task sequence for the actor: shape
of conditioned cue predicted absence or presence of reward for each
animal. Appearance of a subsequent blue go signal was followed by key
release, stimulus touch and reward for actor, and later for conspecific. After
the ITI the monkeys switched roles as actor and passive partner. (C) Single
striatal neuron coding own action and own reward. Note the higher neuronal
activity during own action and own reward compared to own reward
absence and conspecific's actions. (D) Single striatal neuron coding social
action and own reward. This neuron is active during conspecific's actions
that will result in own reward, a complement to the neuron shown in (C).
Monkey picture by smerikal (Flickr), reproduced with permission. Panels
(B-D) based on Baez-Mendoza et al. (2013), reproduced with permission.



How are these signals formed? One possible mechanism is 
as follows: Striatal neurons receive biological motion informa- 
tion either directly from area STP (Oram and Perrett, 1996) or 
indirectly via parietal lobe (Cavada and Goldman-Rakic, 1991) 
while simultaneously receiving reward-related information from 
dopaminergic neurons and other reward-related areas (Haber 
and Knutson, 2010, see also Figure 1). Converging inputs and 
local interactions (Chuhma et al., 2011) are also well-suited to 
combine information about other's actions and own reward. 
Future experiments will test and measure the formation of 
agency and reward conjoint coding in the population of striatal 
neurons. 

Another issue is: how are these signals used? We hypothesize 
that this neuronal signal may help assign, and maintain, credit to a 
social agent when receiving reward in a social context. Solving this 
problem is necessary for successful interactions. It is possible the 
striatum provides a signal to distribute credit for reward among 
the many actors that may have been involved in producing it. One 
key experiment would test the individual-specificity of this signal: 
is the signal specific to one individual, or does it only discriminate
between own action and "other's" actions? Such a fine-grained
signal would aid in discriminating who is a better partner and 
who is not. 



SOCIAL CONTACT AND STRIATAL FUNCTION 

The striatum is involved in other social behaviors besides social 
action, social reward and reward inequity. Social isolation and 
social defeat compromise the normal function of the striatum. 
These effects highlight the interplay between normal social con- 
tact and striatal function. Social isolation has long-lasting effects 
in behavior, neuronal anatomy and neurochemistry. For example, 
social deprivation in the first year of life of macaques is related 
to abnormal social behaviors including fearfulness, withdrawal, 
lack of play, apathy, indifference to external stimuli, deficien- 
cies in communication and aggression (Martin et al., 1991). 
Macaques reared in social deprivation show decreased numbers
of caudate/putamen neurons reactive to substance P, tyrosine
hydroxylase (TH), leucine-enkephalin, and calbindin; in contrast,
the number of somatostatin interneurons did not differ from that of
normally-reared conspecifics. TH staining was reduced in SNc
but neuron numbers were stable. Other subcortical regions were
unaffected, including the NAcc, amygdala and BNST (Martin
et al., 1991). Further characterization of the behavioral, anatomical
and neurochemical effects of social isolation has been carried
out in rodents.

Social isolation leaves consistent behavioral effects on 
rodents. These include hyper-reactivity to novel environments, 
a reduction in the pre-pulse inhibition of the acoustic startle, and
an increase in aggressive behavior (reviewed by Fone and Porkess, 
2008). Also, studies of the neuroanatomy of isolates' brains 
describe changes in cortical and subcortical neuronal circuits. For 
example, after social isolation rats showed decreased dendritic 
spine density in prefrontal cortex and hippocampus compared to 
socially-housed littermates (Silva-Gomez et al., 2003). There are
several reports of differences in neurotransmitter systems; for a
systematic review see Fone and Porkess (2008). Of particular relevance
to this review, the dopaminergic system of socially isolated
rats differs from that of socially-housed animals.

Although socially isolated rats show normal basal levels of
extracellular dopamine (DA) in the ventral striatum, systemic
administration of d-amphetamine produces a significantly larger
increase in DA release than in socially-reared rats (Wilkinson et al.,
1994; Hall et al., 1999). Furthermore, isolation-reared rats show
an increase in DA turnover and in hyper-locomotion induced by
d-amphetamine (Hall et al., 1998). Injections of cocaine increase
DA efflux in ventral striatum, an effect potentiated by isolation
rearing (Howes et al., 2000). Intriguingly, isolates acquire
operant responding for low doses of cocaine faster than
socially-housed rats, but their acquisition is slower for higher doses
(Howes et al., 2000). Deficits in pre-pulse inhibition
of the acoustic startle in socially-isolated rats are reversed by
administration of the D2 receptor antagonist raclopride (Geyer
et al., 1993). DA depletion in ventral striatum after administration
of 6-hydroxydopamine also facilitates pre-pulse inhibition in
socially-isolated rats (Powell et al., 2003). As noted above, basal
levels of extracellular DA in ventral striatum do not differ between
socially-isolated and socially-reared rats (Wilkinson et al., 1994;
Hall et al., 1999; Howes et al., 2000). These results suggest that
basal mesolimbic DA is unaffected by social isolation; rather, the
ventral striatum is "hypersensitive" to events that naturally trigger
DA release.

One candidate mechanism for the hypersensitive ventral striatum
of socially-isolated rats is a difference in receptor levels. Yet
the evidence is mixed: some groups report no changes in D1 or D2
receptor density or affinity in striatum (Bardo and Hammer, 1991;
Del Arco et al., 2004), while others report an increase in D2 binding
(Djouma et al., 2006). Changes in housing condition, however, modify the
levels of D2 receptors in the monkey striatum (Morgan et al., 
2002). Specifically, after monkeys were socially housed, dom- 
inant monkeys had higher levels of D2 receptors in striatum 
compared to when they were housed individually and to subor- 
dinates. Interestingly, subordinates consumed more and worked 
more for intravenous injections of cocaine than dominant mon- 
keys (Morgan et al., 2002). This finding is further supported by 
a negative correlation between the baseline levels of D2 receptors 
and the rate of cocaine self-administration and a decrease in D2 
receptor levels with chronic cocaine use (Nader et al., 2006). Thus, 
these results suggest that D2 receptor density can be modified by 
changes in the social environment. 

Changes in social hierarchy result in winners and losers: lower-ranking
individuals are usually defeated by their conspecifics
and lose their rank. After losing one or more encounters with a
conspecific, mesostriatal transmission is modified in the defeated
individual. Tidey and Miczek (1996) reported that rats that had been
defeated by a conspecific showed higher concentrations of extracellular
DA in ventral striatum and prefrontal cortex during a
social encounter with a dominant rat compared to baseline. If 
rats remained isolated after being defeated, the number of stri- 
atal dopamine transporter (DAT) binding sites was reduced, while 
there were no changes in DAT in animals that returned to the 
familiar group (Isovich et al., 2001). A potential role of levels 
of DAT in regulation of social behavior is suggested by a report 
of DAT knockout mice, which exhibited increased rates of reactivity
and aggression following mild social contact (Rodriguiz
et al., 2004). Mice that experienced chronic social defeat avoid
making contact with conspecifics and show increased levels of 
brain derived neurotrophic factor (BDNF) in the NAcc up to 4 
weeks after the last defeat (Berton et al., 2006). BDNF potenti- 
ates DA release in the NAcc by acting in pre- and post-synaptic 
sites (Russo and Nestler, 2013). The major source of BDNF in 
NAcc is dopaminergic neurons in VTA. BDNF deletion in these 
cells of chronically-defeated mice results in an increase in social 
contact, suggesting that BDNF plays a key role in the main- 
tenance of the social defeat phenotype (Berton et al., 2006). 
These selected studies highlight that mesolimbic dopaminer- 
gic transmission is modified following acute or chronic social 
defeats. 

In conclusion there are behavioral, anatomical and neuro- 
chemical consequences of social isolation. There is a marked 
reduction in the number of striatal interneurons, but basal lev- 
els of extracellular DA remain unchanged. There is no consensus 
whether there are changes in DA receptor levels in the striatum, 
but other signaling systems (BDNF) and molecular mechanisms 
(changes in DAT) are involved. This snapshot of studies on the
relationship between social housing conditions, behavior and
basal ganglia function suggests that the relationship is not a simple
one. Nevertheless, it can be concluded that social isolation
and social defeat result in changes in neurotransmission in the
mesolimbic circuit.

INVOLVEMENT OF THE STRIATUM IN PAIR-BOND FORMATION AND 
MAINTENANCE 

Sex is a primary reward and it is the basis of pair-bond formation 
in voles. The striatum is part of the neuronal circuitry underlying 
a remarkable pair-bond formation in which both partners remain 
monogamous. It is important to note that the role of the striatum 
extends beyond that of movement and reward. Studies on vole 
pair formation provide an interesting example of the interaction 
between social behavior and striatal function. 

There are two similar species in the same genus, one of
which is monogamous and the other promiscuous. Prairie voles
(Microtus ochrogaster) form life-long bonds with their first mate,
remain monogamous and live in burrows with extended families;
meadow voles (Microtus pennsylvanicus), in contrast, are
a promiscuous species often living in solitary burrows (Insel,
2010). This natural dissociation in pair formation provides the 
opportunity to tap into the neurobiology of social behavior. 

The interplay of oxytocin, arginine-vasopressin and DA plays
a pivotal role in pair formation in voles. Administration of
haloperidol, an unselective DA inverse agonist, in male prairie
voles' NAcc prevents partner preference, whilst stimulating
D2-like receptors in caudate-putamen induces partner preference
in the absence of mating (Aragona et al., 2003, 2006).
Conversely, DA D1-like receptor activation prevents pair-bond
formation (Aragona et al., 2006). This mechanism is similar in
females, since D2-like receptor stimulation induces partner preference
whereas administration of a D1-like agonist has no effect
(Wang et al., 1999). Vasopressin V1a receptor gene transfer into
the ventral pallidum of polygamous meadow voles is sufficient to
induce pair-bond-like behavior after mating (Lim et al., 2004b).
Similarly, overexpression of oxytocin receptor in NAcc facilitated
partner preference in female prairie voles but had no effect on
parental care, nor any effect on female meadow voles (Ross et al.,
2009). Prairie voles have a high density of oxytocin receptors in
the NAcc and of vasopressin V1a receptors in the ventral pallidum
compared to meadow voles (Insel and Shapiro, 1992; Hammock
and Young, 2006). Notably, oxytocin receptors bind oxytocin
and, with lower affinity, vasopressin (Gimpl and
Fahrenholz, 2001). In contrast, there are no differences in the
distribution of D1-like and D2-like receptors in the striatum
between these two species (Lim et al., 2004a). Thus, these results
suggest that the differential distribution of oxytocin and vasopressin
receptors is responsible for pair-bond formation. In conclusion,
pair-bond formation is modulated by the interaction of
oxytocin, vasopressin and DA in NAcc neurons as well as the
distribution of oxytocin and vasopressin V1a receptors.

The role of oxytocin and vasopressin in social recognition is 
supported further by the absence of habituation to conspecifics in 
oxytocin and V1a-R knockout mice (Ferguson et al., 2000; Bielsky
et al., 2004). Oxytocin knockout mice "recover" social habituation
after infusion of oxytocin agonists in central amygdala (Ferguson
et al., 2001). Similarly, local infusion of V1a-R antagonists in lateral
septum of rats inhibits habituation to conspecifics (Everts and
Koolhaas, 1999). Thus, both oxytocin and vasopressin regulate 
social recognition. 

The endogenous opioid system is another neuronal mecha- 
nism that may play a role in pair-bond formation. Mu-opioid 
receptor (MOR) activation modulates partner preference in 
female prairie voles (Burkett et al., 2011). MOR density is striatal 
region specific, thus this effect is probably mediated by specific 
striatal regions (Resendez et al., 2013). MORs within the dorsal 
striatum mediate partner preference formation via impairment of 
mating, whereas receptors in NAcc appear to mediate pair bond 
formation through the positive hedonics associated with mating 
(Resendez et al., 2013). Interestingly, monogamous voles show 
higher MOR density in forebrain including the caudate-putamen 
and NAcc than the closely-related polygamous voles (Inoue et al., 
2013), but see (Insel and Shapiro, 1992). Thus, interspecies dif- 
ferences in opiate receptor density and pharmacological effects 
suggest a role of opiates in social attachment. 

A relevant question is how and where these neurotransmitter
systems interact. Rat NAcc core neurons expressing D1-like
receptors co-express prodynorphin; conversely, D2-like-expressing
cells co-express proenkephalin (Curran and Watson, 1995).
An electron microscope investigation indicates that about half
of neurons in the rat dorsolateral striatum co-express D2 receptors
and MORs (Ambrose et al., 2004). These anatomical studies support
the possibility that oxytocin, vasopressin and D2-like receptors
are present in single striatal cells, yet their interactions remain to
be further investigated.

Little is known about pair-bond formation in primates. 
However, marmosets, a monogamous new-world monkey, show 
oxytocin receptor labeling in NAcc among other subcorti- 
cal structures (Schorscher-Petcu et al., 2009), whereas rhesus 
macaques, a polygamous old-world monkey, only show label- 
ing for this receptor in hypothalamus and the nucleus basalis of 
Meynert (Freeman et al., 2012). Titi monkeys, a monogamous
species, exhibit small but significant changes in glucose
uptake in the NAcc and ventral pallidum 48 h after mating (Bales
et al., 2007).

Whereas we have learned much about pair-bond formation, the neuronal
mechanisms of pair-bond maintenance are just starting to
be investigated. For example, monogamous male voles show a
significant increase in D1-like receptors in NAcc after pair-bond
formation, and D1-like receptor antagonists diminish aggressive
behavior toward female strangers, a behavioral marker of pair-bond
formation (Aragona et al., 2006). This is probably the most
exciting open question in pair-bond research: what are the
neuronal mechanisms of pair-bond maintenance?

The striatum might also play a role in mother's recognition 
of offspring. The pregnancy hormones progesterone and oestro- 
gen prime the brain for the synthesis of oxytocin and its receptor 
(Keverne and Curley, 2004). Olfaction is the prime sense for 
maternal offspring recognition in mammals. Oxytocin receptor
expression increases in central olfactory projections and NAcc
during pregnancy (Keverne and Curley, 2004). 

Overall, these studies suggest a mechanism for pair-bond
formation in voles. The hypothetical mechanism is centered on
the striatum's capability to facilitate the association between
olfactory social cues and reward. A potential mate's pheromones reach
the vomeronasal organ (VNO), which in turn transmits the individual's
information to the extended amygdala, and the central
amygdala further transmits this information to striatum. VNO
lesions in female voles disrupt pair formation (Curtis et al., 2001),
a finding that supports this hypothetical mechanism. However,
other brain areas may also play a role in pair-bond formation.
For example, there are marked differences in the distribution of
dopamine, oxytocin and vasopressin receptors in the medial prefrontal
cortex of monogamous and promiscuous voles (Smeltzer
et al., 2006). As noted by Young, Wang and colleagues (Lim et al.,
2004b; Young and Wang, 2004), the cellular mechanism might be the
co-activation of D2-expressing accumbal neurons by vasopressin
and/or oxytocin. Oxytocin is released by the hypothalamus, odor
information is transmitted from the central amygdala and DA is
released by dopaminergic neurons in VTA. Striatal neurons are
well-suited for detecting the conjunction of sensorimotor information
and reward. In pair-bond formation the role of the
striatum, particularly the NAcc, is to facilitate the association of
social cues and reward to guarantee reproductive success.

CONCLUSIONS 

Based on the studies reviewed here, we conclude that the stria- 
tum plays a role in computations that take place during social 
behavior. These computations revolve around social actions and 
social rewards. fMRI and neurophysiology studies show that 
neural activity in the striatum is modulated by social rewards
and by learning in a social context (Figure 3). By learning in 
this context we refer to: learning about other's preferences, a new 
mate, about other's actions that lead to own reward, or updat- 
ing our predictions about other's preferences. We have shown 
that neuronal activity in the striatum is also modulated by social 
actions and, critically, by the conjunction of social action and own 
reward (Figure 4). The computations performed by the stria- 
tum are critical for successful social interactions. A breakdown in 
social interactions leads to compromised striatal function, which 
highlights the interplay between this neuronal circuit and social 
behavior. 

Overall, these observations suggest that the striatum does not 
appear to have a particular "social" specialization; rather its neu- 
rons are capable of flexibly incorporating social information into 
their computations. Therefore, it is justified to speak of the striatum
as containing a general purpose neuronal mechanism to
associate actions or events with reward. Importantly, it can also
associate (or reflect) other's actions with the rewards they lead to.
Rewards are also coded in the activity of striatal neurons, and as
social rewards are a sub-class of rewards, they are processed in the
striatum. A functional subdivision based on different
types of social behaviors needs to await further experimentation.
In conclusion, the striatum plays a role in the computation of
social behavior.